[Automatic Import] Safely access non-identifier fields in Painless if context#205220
ilyannn merged 9 commits into elastic:main
Pinging @elastic/security-scalability (Team:Security-Scalability)
…into auto-import/quote-fields
 * @returns `true` if the string is a valid Painless identifier and not a reserved word, `false` otherwise.
 */
export function isPainlessIdentifier(s: string): boolean {
  return PAINLESS_IDENTIFIER_REGEXP.test(s) && !PAINLESS_RESERVED_WORDS.has(s);
Is the check for Painless reserved keywords necessary? I get the regex check, but, for example,
Map test = new HashMap();
test.if = 'Reserved keyword';
return test.if;
does not throw any error for me in the Painless Lab.
That is a good point! I've taken this logic from the documentation that says:
Use an identifier as a named token to specify a ... field ...
and that a keyword cannot be an identifier, but perhaps the documentation is incomplete! I'll test whether this works specifically for ingest pipelines, and if it does, I'll remove the check.
Yes, when accessing a field, all identifiers are fine:
ctx.if?.host // ok
though this is still parsed as an `if` statement:
if.something?.host // "unexpected token ['.'] was expecting one of ['(']."
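The outcome of this thread can be sketched in TypeScript as follows. This is only an illustration, not the code merged in the PR: the name `canUseDotAccess` is hypothetical, and the regular expression is an assumption based on the usual definition of a Painless identifier (a letter or underscore followed by letters, digits, or underscores). It reflects the observation above that a reserved word is acceptable after a dot, so only the pattern is checked.

```typescript
// Assumed identifier pattern: a letter or underscore, then letters, digits or underscores.
const IDENTIFIER_PATTERN = /^[_a-zA-Z][_a-zA-Z0-9]*$/;

// Hypothetical helper: decides whether `name` may follow a dot, as in `ctx.name`.
// Reserved words such as `if` are deliberately not rejected here, because the
// thread above found that `ctx.if` is accepted when used as a field access.
function canUseDotAccess(name: string): boolean {
  return IDENTIFIER_PATTERN.test(name);
}
```

Under these assumptions, `canUseDotAccess('if')` is accepted while `canUseDotAccess('@timestamp')` is rejected. The check is only meaningful for names that come after `ctx.`; a reserved word at the start of an expression is still parsed as a keyword.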
Starting backport for target branches: 8.16, 8.17, 8.x https://github.com/elastic/kibana/actions/runs/12600012995
… context (elastic#205220)
Closes elastic#205024
We add utility functions to access nested fields in Painless in a safe way and modify the existing ECS generation logic to use them.
This access happens using the `object?.get("field")` syntax for complex cases, while falling back to the familiar `ctx.field` for the cases where `field` is a valid Painless identifier and `ctx` is known to be non-nullable.
This takes care of the compile-time correctness of field accesses. Note that it is still possible for generated pipelines to fail in runtime on unexpected input, e.g. accessing a nested field `a.b` fails for the document of the form `{"a": "string"}`.
See the PR for more details, release note and test results.
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
(cherry picked from commit 1fb16c5)
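A minimal TypeScript sketch of the access rule described in this commit message is given below. It is not the utility added by the PR: the names `painlessFieldAccess` and `quotePainlessString` are hypothetical, the identifier pattern is an assumption, and the choice to use plain dot access only for the first segment (directly under the non-nullable `ctx`) is one reading of the description.

```typescript
// Assumed identifier pattern: a letter or underscore, then letters, digits or underscores.
const IDENTIFIER = /^[_a-zA-Z][_a-zA-Z0-9]*$/;

// Hypothetical helper: wraps a field name in a double-quoted string literal,
// escaping backslashes and double quotes.
function quotePainlessString(s: string): string {
  return '"' + s.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Hypothetical helper: builds a Painless expression that reads a nested field.
// The first segment uses `ctx.field` when it looks like an identifier, because
// `ctx` is known to be non-null; every other case uses the null-safe
// `?.get("field")` form.
function painlessFieldAccess(path: string[]): string {
  return path.reduce((expr, segment, index) => {
    if (index === 0 && IDENTIFIER.test(segment)) {
      return `${expr}.${segment}`;
    }
    return `${expr}?.get(${quotePainlessString(segment)})`;
  }, 'ctx');
}
```

With these assumptions, the path `['host', 'name']` becomes `ctx.host?.get("name")` and `['@timestamp']` becomes `ctx?.get("@timestamp")`. As the commit message notes, this only addresses compile-time correctness: an expression generated for `['a', 'b']` can still fail at runtime on a document such as `{"a": "string"}`.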
💔 Some backports could not be created
…ess if context (#205220) (#205510)
# Backport
This will backport the following commits from `main` to `8.x`:
- [[Automatic Import] Safely access non-identifier fields in Painless if context (#205220)](#205220)
<!--- Backport version: 9.4.3 -->
### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)
Co-authored-by: Ilya Nikokoshev <ilya.nikokoshev@elastic.co>
Release Note
Fixes how Automatic Import generates accesses to fields whose names are not valid Painless identifiers.
Summary
Closes #205024
We add utility functions to access nested fields in Painless in a safe way and modify the existing ECS generation logic to use them.
This access happens using the `object?.get("field")` syntax for complex cases, while falling back to the familiar `ctx.field` for the cases where `field` is a valid Painless identifier and `ctx` is known to be non-nullable.
In the future this should be taken care of by the new `$('a.b.c', defaultValue)` accessor function (elastic/elasticsearch#101274). For now, it's not available.
This takes care of the compile-time correctness of field accesses. Note that it is still possible for generated pipelines to fail at runtime on unexpected input, e.g. accessing a nested field `a.b` fails for a document of the form `{"a": "string"}`.
Testing
The two utility files we add are fully covered with unit tests.
Here's the generated package for logs containing only the `@timestamp` field:
Checklist
`release_note:*` label is applied per the guidelines